Overall structure
Assume we have a database containing many nodes, each lying on a path from some environment to "success".
Given the current environment, we query the database for similar environments.
There is a higher chance of matching environments that are close to "success", because the number of "success" environments is much smaller than the number of random environments.
Like a decision tree, there are fewer nodes at the lower levels.
By removing old data and adding new "success" data, the database should converge toward the "ideal" database.
A Monte Carlo tree needs its "fail" leaves pruned; this process works like adding "success" leaves, so randomness will not destroy the tree.
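
A rough sketch of this idea (all names hypothetical, assuming each environment is encoded as a fixed-length feature vector and similarity is Euclidean distance):

    # Hypothetical environment database: fixed-length vectors, Euclidean
    # similarity, and pruning of the oldest entries. An illustration of
    # the structure described above, not the project's actual code.
    import numpy as np

    class EnvironmentDB:
        def __init__(self, max_size=100_000):
            self.vectors = []            # encoded environments
            self.steps_to_success = []   # distance of each node from "success"
            self.max_size = max_size

        def add_success_path(self, path_vectors):
            # Store every node on a path that ended in "success",
            # newest entries at the end of the lists.
            n = len(path_vectors)
            for i, v in enumerate(path_vectors):
                self.vectors.append(np.asarray(v, dtype=np.float32))
                self.steps_to_success.append(n - 1 - i)
            self._prune()

        def query(self, env_vector, k=5):
            # Find the k stored environments most similar to the current one.
            dists = [np.linalg.norm(v - env_vector) for v in self.vectors]
            order = np.argsort(dists)[:k]
            return [(self.vectors[i], self.steps_to_success[i]) for i in order]

        def _prune(self):
            # Drop the oldest data so the database keeps converging
            # toward recent "success" data.
            excess = len(self.vectors) - self.max_size
            if excess > 0:
                self.vectors = self.vectors[excess:]
                self.steps_to_success = self.steps_to_success[excess:]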
Updates below; oldest at the bottom.
One of the chi decision models; an output of 0 means do not call chi.
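
A minimal sketch of such a model in PyTorch, assuming the hand counts plus the offered tile are encoded as a flat vector; the encoding and layer sizes are illustrative assumptions, not the actual implementation:

    # Hypothetical binary chi-decision model. An output below 0.5 is
    # read as 0, i.e. don't call chi.
    import torch
    import torch.nn as nn

    class ChiDecisionModel(nn.Module):
        def __init__(self, input_dim=34 + 34):   # hand counts + offered tile (one-hot)
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(input_dim, 128),
                nn.ReLU(),
                nn.Linear(128, 1),
                nn.Sigmoid(),                    # probability of calling chi
            )

        def forward(self, x):
            return self.net(x)

    model = ChiDecisionModel()
    state = torch.zeros(1, 34 + 34)              # dummy encoded hand + offered tile
    decision = int(model(state).item() >= 0.5)   # 0 means don't choose chi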
Some improvement on shanten prediction.
An LSTM is used to predict shanten.
I think the results are good enough.
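
A minimal sketch of what the LSTM predictor could look like, treating the hand as a sequence of tile indices and classifying shanten 0..8; all dimensions here are assumptions:

    # Hypothetical LSTM shanten predictor: embed each tile, run the
    # sequence through an LSTM, classify the final hidden state.
    import torch
    import torch.nn as nn

    class ShantenLSTM(nn.Module):
        def __init__(self, n_tile_types=34, embed_dim=16, hidden_dim=64, max_shanten=8):
            super().__init__()
            self.embed = nn.Embedding(n_tile_types, embed_dim)
            self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
            self.head = nn.Linear(hidden_dim, max_shanten + 1)  # one class per shanten value

        def forward(self, tiles):            # tiles: (batch, 14) tile indices
            x = self.embed(tiles)            # (batch, 14, embed_dim)
            _, (h, _) = self.lstm(x)         # h: (1, batch, hidden_dim)
            return self.head(h[-1])          # logits over shanten values

    model = ShantenLSTM()
    hand = torch.randint(0, 34, (1, 14))     # a dummy 14-tile hand
    shanten_pred = model(hand).argmax(dim=1)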
Trained on 1M imported hand-tile samples, a basic CNN matches the human choice with its top prediction 50% of the time, with its second prediction 20% of the time, and with its third prediction 10% of the time.
Considering that human play in "defense mode" is not predictable from hand tiles alone, this seems like a good start.
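
A minimal sketch of a basic CNN discard predictor plus the exact-rank evaluation described above; the 4x34 hand encoding (one row per tile copy) and the layer sizes are assumptions for illustration:

    # Hypothetical discard CNN over a (1, 4, 34) hand encoding, with the
    # "k-th prediction equals the human choice" metric from above.
    import torch
    import torch.nn as nn

    class DiscardCNN(nn.Module):
        def __init__(self):
            super().__init__()
            self.conv = nn.Sequential(
                nn.Conv2d(1, 32, kernel_size=(4, 3), padding=(0, 1)),  # (1,4,34) -> (32,1,34)
                nn.ReLU(),
            )
            self.head = nn.Linear(32 * 34, 34)  # logits over the 34 tile types

        def forward(self, x):                   # x: (batch, 1, 4, 34)
            return self.head(self.conv(x).flatten(1))

    def rank_k_match_rate(logits, human_choice, k):
        # Fraction of samples whose k-th ranked prediction equals the human discard.
        ranked = logits.argsort(dim=1, descending=True)
        return (ranked[:, k - 1] == human_choice).float().mean().item()

    model = DiscardCNN()
    hands = torch.zeros(8, 1, 4, 34)            # dummy batch of encoded hands
    human = torch.randint(0, 34, (8,))          # dummy human discard labels
    logits = model(hands)
    for k in (1, 2, 3):                         # top / second / third choice
        print(k, rank_k_match_rate(logits, human, k))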